Combining Graph-Based Learning With Automated Data Collection for Code Vulnerability Detection
Authors
Abstract
This paper presents FUNDED (Flow-sensitive vUl-Nerability coDE Detection), a novel learning framework for building vulnerability detection models. FUNDED leverages the advances in graph neural networks (GNNs) to develop a novel graph-based learning method to capture and reason about the program's control, data, and call dependencies. Unlike prior work that treats the program as a sequential sequence or an untyped graph, FUNDED learns and operates on a graph representation of the program source code, in which individual statements are connected to other statements through relational edges. By capturing the program syntax, semantics, and flows, FUNDED finds a better code representation for the downstream software vulnerability detection task. To provide sufficient training data to build an effective deep learning model, we combine probabilistic learning and statistical assessments to automatically gather high-quality training samples from open-source projects. This provides many real-life vulnerable code training samples to complement the limited vulnerable code samples available in standard vulnerability databases. We apply FUNDED to identify software vulnerabilities at the function level from program source code. We evaluate FUNDED on large real-world datasets with programs written in C, Java, Swift, and PHP, and compare it against six state-of-the-art code vulnerability detection models. Experimental results show that FUNDED significantly outperforms alternative approaches across evaluation settings.
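As a rough illustration of the kind of representation the abstract describes (statements as nodes, joined by typed, relational edges), the following is a minimal Python sketch. It is not the FUNDED implementation: the class `StatementGraph`, the edge-type names, and the C snippet used as the example are all hypothetical and chosen only to mirror the control, data, and call dependencies named in the abstract.

```python
from collections import defaultdict

# Hypothetical edge types, named after the dependencies listed in the abstract.
EDGE_TYPES = ("control", "data", "call")


class StatementGraph:
    """Statements as nodes, connected by typed (relational) edges."""

    def __init__(self):
        self.statements = []
        # edge type -> source node index -> list of target node indices
        self.edges = {t: defaultdict(list) for t in EDGE_TYPES}

    def add_statement(self, text):
        """Add a statement node and return its index."""
        self.statements.append(text)
        return len(self.statements) - 1

    def add_edge(self, src, dst, edge_type):
        """Add a directed edge of the given type from src to dst."""
        if edge_type not in self.edges:
            raise ValueError(f"unknown edge type: {edge_type}")
        self.edges[edge_type][src].append(dst)

    def neighbors(self, node, edge_type):
        """Return the targets reachable from node via one edge of edge_type."""
        return list(self.edges[edge_type].get(node, []))


def build_example():
    """Build a toy graph for a small, made-up C fragment."""
    g = StatementGraph()
    s0 = g.add_statement("int n = read_len();")
    s1 = g.add_statement("char buf[8];")
    s2 = g.add_statement("if (n < 8)")
    s3 = g.add_statement("memcpy(buf, src, n);")
    s4 = g.add_statement("int read_len(void)")  # callee entry node

    # Control flow: execution order between statements.
    g.add_edge(s0, s1, "control")
    g.add_edge(s1, s2, "control")
    g.add_edge(s2, s3, "control")
    # Data flow: a definition reaches a later use.
    g.add_edge(s0, s2, "data")  # n
    g.add_edge(s0, s3, "data")  # n
    g.add_edge(s1, s3, "data")  # buf
    # Call dependency: the statement invokes read_len.
    g.add_edge(s0, s4, "call")
    return g


if __name__ == "__main__":
    graph = build_example()
    for edge_type in EDGE_TYPES:
        print(edge_type, dict(graph.edges[edge_type]))
```

Keeping a separate adjacency map per edge type is one simple way to let a downstream model treat each relation differently, in contrast to the untyped graph the abstract mentions.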
Similar resources
Automated software vulnerability detection with machine learning
Thousands of security vulnerabilities are discovered in production software each year, either reported publicly to the Common Vulnerabilities and Exposures database or discovered internally in proprietary code. Vulnerabilities often manifest themselves in subtle ways that are not obvious to code reviewers or the developers themselves. With the wealth of open source code available for analysis, ...
Data Mining for Automated GIS Data Collection
The automatic analysis of spatial data sets presumes to have techniques for interpretation and structure recognition. Such procedures are especially needed in GIS and digital cartography in order to automate the time-consuming data update and to generate multi-scale representations of the data. In order to infer higher level information from a more detailed data set, coherent, homogeneous struc...
Learning graph affinities for spectral graph-based salient object detection
Computer vision and pattern recognition techniques based on graph theory constitute a well-established research area, due mainly to their success in efficiently representing and solving many related problems such as image segmentation [1], [2] and saliency estimation [9]. Graph construction for the related problems is traditionally performed manually. This construction involves three major steps:...
VulDeePecker: A Deep Learning-Based System for Vulnerability Detection
The automatic detection of software vulnerabilities is an important research problem. However, existing solutions to this problem rely on human experts to define features and often miss many vulnerabilities (i.e., they incur a high false negative rate). In this paper, we initiate the study of using deep learning-based vulnerability detection to relieve human experts from the tedious and subjective...
Automated data collection for electron microscopic tomography.
A fundamental challenge in electron microscopic tomography (EMT) has been to develop automated data collection strategies that are both efficient and robust. UCSF Tomography was developed to provide an inclusive solution from target finding, sequential EMT data collection, to real-time reconstruction for both single and dual axes. The predictive data collection method that is the cornerstone of...
Journal
Journal title: IEEE Transactions on Information Forensics and Security
Year: 2021
ISSN: 1556-6013, 1556-6021
DOI: https://doi.org/10.1109/tifs.2020.3044773